Phrase-based language models for speech recognition

Authors

  • Hong-Kwang Jeff Kuo
  • Wolfgang Reichl
Abstract

Including phrases in the vocabulary list can improve n-gram language models used in speech recognition. In this paper, we report results of automatic extraction of phrases from the training text using frequency, likelihood, and correlation criteria. We show how a language model built from a vocabulary that includes useful phrases can systematically improve language model perplexity in a natural language call-routing task and in the 20K-Nov92 Wall Street Journal evaluation. We also discuss the impact of such phrase-based language models on recognition word error rate.
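
The phrase-extraction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes adjacent word pairs as the only phrase candidates, uses raw count as the frequency criterion, and substitutes pointwise mutual information (PMI) for the correlation criterion. The function names, the `count * PMI` ranking, and the toy call-routing sentences are all hypothetical.

```python
import math
from collections import Counter


def score_bigrams(sentences, min_count=2):
    """Return {(w1, w2): (count, pmi)} for adjacent pairs seen at least min_count times."""
    unigrams = Counter()
    bigrams = Counter()
    for words in sentences:
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    total_uni = sum(unigrams.values())
    total_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # frequency criterion: drop rare pairs
        p_pair = count / total_bi
        p_w1 = unigrams[w1] / total_uni
        p_w2 = unigrams[w2] / total_uni
        # PMI stands in for a correlation criterion (an assumption of this sketch)
        scores[(w1, w2)] = (count, math.log2(p_pair / (p_w1 * p_w2)))
    return scores


def select_phrases(scores, top_k=2):
    """Pick the top_k pairs ranked by count * PMI (an illustrative combination)."""
    ranked = sorted(scores, key=lambda pair: scores[pair][0] * scores[pair][1], reverse=True)
    return set(ranked[:top_k])


def merge_phrases(words, phrases):
    """Rewrite a sentence left to right, joining selected pairs into single tokens."""
    merged = []
    i = 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) in phrases:
            merged.append(words[i] + "_" + words[i + 1])
            i += 2
        else:
            merged.append(words[i])
            i += 1
    return merged


# Toy corpus (invented for illustration only)
corpus = [
    "i want to check my account balance".split(),
    "please check my account balance today".split(),
    "i want to pay my bill".split(),
    "i want my account balance".split(),
]
scores = score_bigrams(corpus)
phrases = select_phrases(scores)
rewritten = [merge_phrases(words, phrases) for words in corpus]
```

After rewriting, the merged tokens (for example `account_balance`) are ordinary vocabulary entries, so a standard n-gram model trained on `rewritten` spans more original words per n-gram. Perplexity for such a model should be normalized by the original word count so that it stays comparable with a word-based baseline.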

Similar resources

Phrase Based Language Model For Statistical Machine Translation

We consider phrase-based language models (LMs), which generalize the commonly used word-level models. A similar concept of phrase-based LMs appears in speech recognition, but it is rather specialized and thus less suitable for machine translation (MT). In contrast to the dependency LM, we first introduce the exhaustive phrase-based LMs tailored for MT use. Preliminary experimental results show that...

Finite-State Approximation of Phrase Structure Grammars

Phrase-structure grammars are effective models for important syntactic and semantic aspects of natural languages, but can be computationally too demanding for use as language models in real-time speech recognition. Therefore, finite-state models are used instead, even though they lack expressive power. To reconcile those two alternatives, we designed an algorithm to compute finite-state approxi...

Statistical Machine Translation and Automatic Speech Recognition under Uncertainty

Statistical modeling techniques have been applied successfully to natural language processing tasks such as automatic speech recognition (ASR) and statistical machine translation (SMT). Since most statistical approaches rely heavily on availability of data and the underlying model assumptions, reduction in uncertainty is critical to their optimal performance. In speech translation, the uncertai...

Stochastic language models for speech recognition and understanding

Stochastic language models for speech recognition have traditionally been designed and evaluated in order to optimize word accuracy. In this work, we present a novel framework for training stochastic language models by optimizing two different criteria appropriate for speech recognition and language understanding. First, the language entropy and salience measure are used for learning the releva...

Nonnative speech recognition based on state-candidate bilingual model modification

Speech recognition accuracy has been observed to decrease for nonnative speakers, especially those who are just beginning to learn a foreign language or who have heavy accents. This paper presents a novel bilingual model modification approach to improve nonnative speech recognition, considering these great variations of accented pronunciations. Each state of the baseline nonnative acoustic mo...

Publication date: 1999